Abstract

Temiar reduplication is a difficult piece of prosodic 
morphology. This paper presents the first com- 
putational analysis of Temiar reduplication, us- 
ing the novel finite-state approach of One-Level 
Prosodic Morphology originally developed by 
Walther (1999b, 2000). After reviewing both the 
data and the basic tenets of One-Level Prosodic Morphology,
the analysis is laid out in some detail,
using the notation of the FSA Utilities finite-state 
toolkit (van Noord 1997). One important discovery 
is that in this approach one can easily define a regu- 
lar expression operator which ambiguously scans a 
string in the left- or rightward direction for a cer- 
tain prosodic property. This yields an elegant ac- 
count of base-length-dependent triggering of redu- 
plication as found in Temiar. 

1 Introduction 

Temiar is an Austroasiatic language of the Mon- 
Khmer group spoken by a variety of tribal people in 
West Malaysia (Benjamin, 1976). Its intricate mor- 
phological system has received some attention in the 
theoretical literature. The main focus has been on 
the aspectual morphology of verbs, where an inter- 
esting pattern of partial reduplication emerges that 
is sensitive to the size of the verbal root. For example,
in the active continuative, gɛlgəl 'to eat' reduplicates
both the initial /g/ and the final /l/ of its
monosyllabic base gəl. In contrast, bisyllabic səluh
'to shoot' comes out as sɛhluh, where only the final
/h/ is copied, this time as an infix.

Temiar reduplication thus appears to be a suitably 
rich testing ground for a novel approach to redu- 
plication developed by (Walther, 1999b; Walther, 
2000) within a finite-state framework. Even though 
that approach, One-Level Prosodic Morphology, 
was presented from the outset as being generally 
applicable, it has been proven time and time again 
that only concrete empirical application of a particular
approach to computational morphology and
phonology will fully reveal its inherent virtues and 
weaknesses. As an example, (Beesley, 1998) re- 
ports that it was actual experimentation with gram- 
mars of word-formation in Arabic and Hungarian 
which fully revealed the negative effects of mod- 
elling long-distance circumfixional dependencies in 
purely finite-state terms, subsequently leading to 
some suggestions for improvement. 

It is perhaps worth emphasizing that (Walther,
1999b)'s solution for reduplication in a finite-state
context is preferable for cross-linguistic validation
precisely because it is the first that solves the problem
in the general case. Because reduplication often
involves copying of a strictly bounded amount
of material, the bounded case could in principle be 
modelled as a finite-state process by enumerating all 
possible forms of the copy and then making sure 
each was matched to the proper stem. To solve this 
simplified problem, no new techniques are needed 
in theory. In practice however, the brute-force enu- 
meration approach apparently has not been pursued 
further, apart from isolated examples (see Antworth 
(1990), p. 157f for a fixed-size case in Tagalog). This
is probably because such an approach is awkward 
to specify in actual grammars and because it will 
inevitably lead to an explosion of the state space 
(Sproat (1992), p. 161). Finally and in contrast to 
(Walther, 1999b), it would clearly break down for 
productive total reduplication, which is isomorphic 
to the context-sensitive language $\{ww \mid w \in \Sigma^{+}\}$.

A second motivation for choosing Temiar is that 
all prior analyses of its data are heavily under- 
formalized and incomplete, irrespective of whether 
they are situated in the older rule paradigm (Mc- 
Carthy, 1982; Broselow and McCarthy, 1983; Sloan, 
1988; Shaw, 1993) or an optimality-theoretic setting 
(Gafos, 1995; Gafos, 1996; Gafos, 1998b; Gafos, 
1998a). Hence a formalized and computationally 
tested analysis that strives to keep a healthy balance 
with respect to linguistic adequacy would represent
significant progress on its own. 

In the rest of the paper I will attempt to provide 
just such an analysis, beginning in §2 with a presentation
of the relevant data. Next, §3 reviews
the core of One-Level Prosodic Morphology, which 
will be used as formal background. Using that back- 
ground, the analysis is then fully developed in §4. 
The paper concludes with some discussion in §5. 

2 Temiar reduplication 

All data on Temiar reduplication in this section 
come from (Benjamin, 1976), the main source on 
the subject.¹ According to Benjamin, the characteristic
aspectual paradigms of "monosyllabic and
schewa-form verbs" (B:168) are as follows (B:169): 



(1)                          'to call'        'to lie down/sleep/marry'
                             'monosyllabic'   'schewa-form'
    Active     perfective    'kɔɔw            sə.'lɔg
               simulfactive  ka.'kɔɔw         sa.'lɔg
               continuative  kɛw.'kɔɔw        sɛg.'lɔg
    Causative  perfective    tɛr.'kɔɔw        sɛr.'lɔg
               simulfactive  tə.ra.'kɔɔw      sə.ra.'lɔg
               continuative  tə.rɛw.'kɔɔw     sə.rɛg.'lɔg



We have inferred syllabifications in (1) from the 
statement that "only two types of syllables occur: 
open syllables of canonical form CV, and closed syl- 
lables of canonical form CVC" (B:141). Note that 
Benjamin abstracts from vowel length here. Word- 
level stress, which is "falling regularly on the final 
syllable" (B:139), is likewise inferred in (1). Observe
that only monosyllabic roots like kɔɔw reduplicate
their initial consonant in the non-perfective
aspectual forms of the active, while longer roots like
səlɔg do not. This contrasts with obligatory reduplication
of the root-final consonant in the continuative.

An important further generalization is that all 
extra segmental material beyond the bare root is 
inserted immediately before the stressed syllable, 
leading to prefixation for monosyllabic roots, but in- 
fixation in polysyllabic ones (Gafos, 1998b). From 
this point of view we can also see a correlation be- 
tween the fact that causative forms of monosyllabic 
roots - which must be at least bisyllabic - begin
with a fixed /t/² and the restriction that words must
"always begin and end with a consonant" (B:141).
In triconsonantal roots like səlɔg that restriction is
taken care of by the first root consonant itself, so no
fixed segment needs to appear.

¹We will abbreviate further references to this work with
"(B: <page number>)" in the text. Moreover, to highlight
reduplicated parts in the data they will often be printed in bold.

According to Benjamin, prefinal syllables - 
which are unstressed - can show alternation of their 
vocalic quality: "In prefinal closed syllables the inner
vowels /e ə o/ are replaced by the outer vowels /i
ɛ u/ respectively" (B:144). This descriptive generalization
accounts for the remaining contrasts in (1),
witness e.g. sə.lɔg versus sɛg.lɔg.

It is interesting to see that Temiar even exhibits 
phonological modifications between base and reduplicant,
affecting consonants in the continuative:

(2) yaap    ~  yɛm.yaap   'to cry' (B:143)
    pət     ~  pɛn.pət    'to long for' (B:146)
    sə.lɔk  ~  sɛŋ.lɔk    'to hunt successfully' (B:146)

Benjamin explains that medial coda consonants 
from the class of oral voiceless stops turn into their 
voiced nasal equivalents in Northern Temiar (and to 
plain voiced stops in the Southern dialect; B:143).

It is of some importance to clarify a number 
of further aspects of the data and their interpreta- 
tion. First, theorists have frequently employed the 
stronger term 'minor syllables' for Benjamin's pre- 
final syllables, reflecting their alleged special status 
by means of an impoverished representation (e.g. 
empty syllable nuclei in (Gafos, 1998b)) and/or fur- 
ther formal mechanisms (e.g. a ban on full vowels 
in prefinal position *Prefinal-V (Gafos, 1998a)). 
We do not follow this move here, because empirically
it is neither true that penultimate vowels are
categorically restricted to schwa-like vowels (halab
'to go downriver', sindul 'to float', etc.), nor
are there any solid statistics of a presumed tendency
to vowel reduction in unstressed syllables, nor can
the variable quality of prefinal vowels be consistently
derived from flanking consonants. Hence,
such penultimate vowels are to be lexically specified
as alternating.

Second, Benjamin's subclass restriction of (1) 
to "monosyllabic and schewa-form verbs" correctly 
excludes polysyllabic roots like the already mentioned
halab and sindul, where prefinal open syllables
with vowels outside of /e ə o/ occur. These
roots undergo "very few morphological changes"
(B:170), basically procliticization.

²Or /b/, if the root starts in /c, t/: /caaʔ/ 'to eat' gives
/bɛr.caaʔ/ 'to feed' (B:169).

Third, paradigms for a given root are hardly
ever complete, with various irregularities and non-productive
patterns also occurring (B:169f). Again, a
good deal of lexicalization would seem necessary to 
correctly describe Temiar verbs in a realistic gram- 
mar fragment. 

Given this descriptive summary, our goals for 
the upcoming analysis are, first, to treat the full 
paradigm of (1). As a second goal, we would like
to reflect the emergent formal desiderata in a transparent
way, in particular referring to the need to
account for repetition, truncation, infixation and 
phonological modification. Thirdly, we will attempt 
a compositional analysis of the morphological ex- 
ponency of aspect. 

3 One-Level Prosodic Morphology 

In order to provide the necessary background for 
the Temiar analysis in §4, this section briefly reviews
the finite-state approach to prosodic morphology
developed in (Walther, 1999b).

That work itself was presented as an extension 
to (Bird and Ellison, 1994)'s One-Level Phonology
framework, where phonological representations,
morphemes and more abstract generalizations
are all finite-state automata that express surface-true 
constraints on word forms, and constraint combina- 
tion is by automata intersection. 

In a nutshell, the extension comprises three main 
components. We (i) represent phonological strings 
differently for purposes of modelling prosodic morphology,
(ii) implement reduplicative copying by
automata intersection, and (iii) introduce a resource- 
conscious variant of automata. 

For (i), operators are provided that construct en- 
riched automata from a simple string automaton, in 
particular giving it a kind of doubly-linked structure 
so that the symbol repetition inherent in redupli- 
cation translates into following backwards-pointing 
technical transitions. The individual enrichments in- 
volve only local computation per state or transition, 
so that on-the-fly implementation is easy if desired. 
In other words, one does not necessarily have to en- 
rich the entire lexicon in advance. 

Enriched representations In a bit more detail,
the enrichments of (i) are as follows. The
three aspects of reduplication or symbol repetition,
truncation or symbol skipping and infixation
or transitive, non-immediate precedence of symbols
are reflected in three regular expression operators,
add_repeats, add_skips, add_self_loops.
Each takes the underlying automaton A of a regular
language as its only argument. Formally, they
can be defined as follows:

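(3) Let $A = (Q, \Sigma, \delta, q_0, F)$ be the minimal $\epsilon$-free³
    finite-state automaton for $L_A$, with $Q$ a finite set of states,
    finite alphabet $\Sigma$, transition function
    $\delta : Q \times \Sigma \mapsto 2^{Q}$, start state $q_0 \in Q$
    and set of final states $F \subseteq Q$.

    a. Assume $\mathit{repeat} \notin \Sigma$.
       $\mathit{add\_repeats}(A) \stackrel{def}{=} (Q, \Sigma', \delta', q_0, F)$,
       where $\Sigma' = \Sigma \cup \{\mathit{repeat}\}$,
       $\forall x \in \Sigma, \forall q \in Q: \delta'(q, x) = \delta(q, x)$ and
       $\forall p \in Q: \delta'(p, \mathit{repeat}) = \{q \mid \exists x \in \Sigma: p \in \delta(q, x)\}$

    b. Assume $\mathit{skip} \notin \Sigma$.
       $\mathit{add\_skips}(A) \stackrel{def}{=} (Q, \Sigma', \delta', q_0, F)$,
       where $\Sigma' = \Sigma \cup \{\mathit{skip}\}$,
       $\forall x \in \Sigma, \forall q \in Q: \delta'(q, x) = \delta(q, x)$ and
       $\forall q \in Q: \delta'(q, \mathit{skip}) = \bigcup_{x \in \Sigma} \delta(q, x)$

    c. $\mathit{add\_self\_loops}(A) \stackrel{def}{=} (Q, \Sigma, \delta', q_0, F)$,
       where $\delta' = \delta \cup \{(q, a, \{q\}) \mid q \in Q, a \in \Sigma\}$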
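
The three operators in (3) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the FSA Utilities implementation used in the paper: automata are plain dictionaries, the helper names string_automaton and _copy are hypothetical, and neither synchronization bits nor the producer/consumer distinction are modelled.

```python
from collections import defaultdict


def string_automaton(segments):
    """Build a linear automaton: state i --segments[i]--> state i+1."""
    delta = defaultdict(set)
    for i, seg in enumerate(segments):
        delta[(i, seg)].add(i + 1)
    return {
        "states": set(range(len(segments) + 1)),
        "alphabet": set(segments),
        "delta": delta,
        "start": 0,
        "final": {len(segments)},
    }


def _copy(automaton, extra_symbols=()):
    """Copy an automaton, optionally extending its alphabet."""
    delta = defaultdict(set)
    for key, targets in automaton["delta"].items():
        delta[key] = set(targets)
    return {
        "states": set(automaton["states"]),
        "alphabet": set(automaton["alphabet"]) | set(extra_symbols),
        "delta": delta,
        "start": automaton["start"],
        "final": set(automaton["final"]),
    }


def add_repeats(automaton):
    """(3a): a 'repeat' arc leads from p back to every q with p in delta(q, x)."""
    result = _copy(automaton, ["repeat"])
    for (q, _x), targets in automaton["delta"].items():
        for p in targets:
            result["delta"][(p, "repeat")].add(q)
    return result


def add_skips(automaton):
    """(3b): a 'skip' arc parallels every existing arc leaving q."""
    result = _copy(automaton, ["skip"])
    for (q, _x), targets in automaton["delta"].items():
        result["delta"][(q, "skip")] |= targets
    return result


def add_self_loops(automaton):
    """(3c): every state gets a self loop for every alphabet symbol."""
    result = _copy(automaton)
    for q in automaton["states"]:
        for a in automaton["alphabet"]:
            result["delta"][(q, a)].add(q)
    return result


# ASCII rendering of the root: '@' stands for schwa, 'O' for open o.
selog = string_automaton(["s", "@", "l", "O", "g"])
enriched = add_repeats(add_skips(add_self_loops(selog)))
```

Each operator only inspects the arcs of one state at a time, which is the locality property that makes on-the-fly enrichment possible.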

An example enrichment of Temiar səlɔg is shown
in figure 1. One can imagine how skip and repeat
transitions allow, figuratively speaking, forward and
backward movement within a string, while self
loops will absorb infixal morphemes that are intersected
with fig. 1. Finally, so-called synchronization
bits :1, :0 were introduced in (Walther, 1999b) to
define the extent of a reduplicative base constituent
in a segment-independent way. Bit value :1 marks
the edges and :0 the interior segments of a base, as
shown in fig. 1 for a hypothetical whole-root reduplication
pattern. In actual practice, synchronization
bits are sets of symbols, just like the rest of the alphabet.
Sets as transition labels improve over traditional
automata in terms of automata compactness,
were already proposed for phonology in (Bird
and Ellison, 1992) and do not increase mathematical
expressivity beyond regular languages.⁴ Hence, the
segmental part of fig. 1 may be defined in a modular
fashion through the intersection of strings of symbol
sets that mention only certain dimensions (here:
phonemes and synchronisation bits), being underspecified
for the unmentioned dimensions. We will
again follow (Walther, 1999b) in conceiving of sets
as types arranged in a type hierarchy that is structured
by set inclusion, and also in allowing arbitrary
boolean combinations of types.

Figure 1: add_repeats(add_skips(add_self_loops(selog)))

³Minimality prevents non-(co)-accessible transitions from
getting enriched, while lack of ε transitions keeps positional
skip/repeat 'movement' in lockstep with segmental positions.

Copying as intersection Given enriched representations
as in fig. 1, various patterns of reduplication
are now easy to define. We can denote a synchronised
abstract string by the regular expression

base = seg:l seg:0* seg:l 

where seg is the type subsuming all phonological 
segments. Then hypothetical total reduplication - 
unattested in Temiar, but well known from Indonesian
and many other languages - is described by

total = base repeat* base 

A variant slightly more akin to Temiar - and actu- 
ally attested in the neighbouring language Semai - 
that skips the interior of the base in a prefixed redu- 
plicant is just as easy: 

semai = seg:l skip* seg:l repeat* base 

Ignoring self loops for the moment, all we need 
now to apply a reduplication pattern to an enriched 
base representation is simply to intersect the former 
with the latter: automata intersection has sufficient 
formal power to implement reduplicative copying!
Here is an example, using the abbreviation selog =
s:1 e:0 l:0 o:0 g:1 for perspicuous display:

add_repeats(selog) ∩ total = selog repeat⁵ selog
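
The effect of this intersection can be checked with a small Python sketch. It is an illustration under stated assumptions rather than the paper's implementation: the enriched base is simulated directly as a linear automaton with backward repeat arcs, and the pattern total is approximated by enumerating candidates of the form base, k repeat symbols, base. The function names are hypothetical.

```python
def accepts_with_repeats(base, symbols):
    """Run `symbols` through add_repeats(base), with base a linear string automaton."""
    state = 0
    for sym in symbols:
        if sym == "repeat":
            if state == 0:
                return False  # no arc to follow backwards from the start state
            state -= 1
        elif state < len(base) and base[state] == sym:
            state += 1
        else:
            return False
    return state == len(base)  # must end in the final state


def total_reduplication(base):
    """Return all candidates 'base repeat^k base' that the enriched base accepts."""
    candidates = [
        base + ["repeat"] * k + base for k in range(2 * len(base) + 1)
    ]
    return [c for c in candidates if accepts_with_repeats(base, c)]


# Synchronized segments of the root, in the abbreviation used in the text.
selog = ["s:1", "e:0", "l:0", "o:0", "g:1"]
survivors = total_reduplication(selog)
surface = [sym for sym in survivors[0] if sym != "repeat"]
```

Only the candidate with exactly five repeat symbols survives, because the second copy of the base can only be read from the start state again; stripping the technical symbols leaves the doubled segment string.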

As pointed out in (Walther, 2000), generalizing
to a set of bases involves nothing more than
enriching each base separately, then forming the
union of the resulting automata. The opposite order
would produce unwanted cross-string repetition,
since add_repeats does not distribute over union.
However, an unpublished experiment shows that
on-demand implementation of a slightly modified
add_repeats can help to preserve the memory efficiency
of building a minimized base lexicon as the
union of individual base strings first. Due to lack of
space, the details will be reported elsewhere.

⁴Of course, the identity requirement for matching transitions
in traditional automata intersection must be replaced by
a non-empty intersection requirement for set-based matching.

Resource consciousness As much as we need the 
formal means provided by self loops for infixations 
like Temiar s-a-lɔg, the resulting automata overgenerate
massively. What's missing according to
(Walther, 1999b) is a distinction between explicitly
contributed, independent information (e.g. the
infix -a- itself) and contextual, dependent information
that is tolerated but must be provided by other
constraints (e.g. the self loop that hosts the
infix). Therefore, a parallel distinction between two
kinds of symbols - producers and consumers - was 
introduced. In that scenario a symbol represents an 
information resource that needs to be produced at 
least once, then can be consumed arbitrarily often. 
To utilize the distinction, an additional P/C bit ac- 
companies symbols, with P/C = 1 for producers. All 
symbols introduced by the three enrichment oper- 
ators are consumers. Furthermore, automata inter- 
section is made aware of these resource-conscious 
notions by splitting it into two variants: In open 
interpretation mode, P/C bits of matching symbols 
are combined by logical OR, so that a result transi- 
tion will be marked as a producer whenever at least 
one argument transition is a producer. In closed in- 
terpretation mode, combination is by logical AND 
instead, allowing only producer-producer matches. 
Grammatical evaluation can then be characterized 
as follows: 

$(\mathit{Lexicon} \cap_{open} \mathit{Constraint}_1 \cap_{open} \cdots \cap_{open} \mathit{Constraint}_N) \cap_{closed} \mathbf{\Sigma}^{*}$

Here and elsewhere, producers are in bold print. 
Note the final intersection with the universal pro- 
ducer language, which eliminates unused consumer 
transitions, the main source of overgeneration. 
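
The two interpretation modes can be illustrated for a single pair of matching transitions. The following Python sketch is a minimal model under assumptions of its own: a transition label is a pair of a symbol set and a P/C bit (True for producer), and the function name intersect_transition is hypothetical.

```python
def intersect_transition(t1, t2, mode):
    """Combine two transition labels (symbol_set, is_producer).

    Returns the combined label, or None if the transitions do not match.
    """
    symbols1, pc1 = t1
    symbols2, pc2 = t2
    common = symbols1 & symbols2
    if not common:
        return None  # set-based matching needs a non-empty intersection
    if mode == "open":
        return (common, pc1 or pc2)  # logical OR of the P/C bits
    if mode == "closed":
        # logical AND: only producer-producer matches are allowed
        return (common, True) if (pc1 and pc2) else None
    raise ValueError("mode must be 'open' or 'closed'")


infix_a = (frozenset({"a"}), True)          # explicitly contributed infix
self_loop = (frozenset({"a", "e"}), False)  # consumer self loop hosting it
universal = (frozenset({"a", "e"}), True)   # universal producer language

hosted = intersect_transition(infix_a, self_loop, "open")
kept = intersect_transition(hosted, universal, "closed")
pruned = intersect_transition(self_loop, universal, "closed")
```

A self loop that has absorbed a producer survives the final closed intersection, whereas an unused consumer self loop is eliminated.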

4 The analysis 

We have assembled enough background now to pro- 
ceed to the actual analysis of the Temiar data in (1). 
The analysis is implemented using FSA Utilities, a 
finite-state toolbox written in Prolog which encour- 
ages rapid prototyping (van Noord, 1997). Figure 2 
shows a relevant fragment of its syntax (extensions 
and modifications in italics). 

In displaying the grammar, we will take liberty
in suppressing certain definitions in the interest
of conciseness, relying on the mnemonic value of
their names instead. A case in point is producer(T),
consumer(T): since the names are self-explanatory,
it suffices to note that the only argument T contains
type formulae that denote the symbol sets, as
explained before. Allowable type-combining operators
are conjunction &, disjunction ; and negation ~.
The same goes for monotonic rules, which - unlike
rewrite rules - can only specialize their focussed
segmental position a to b. They exist in two variants,
where a -r-> b/c notates the case where context c
is right-adjacent to the focus (A → B / _ C), and
vice versa for a -l-> b/c.

{}                              empty language
[E1,E2,...,En]                  concatenation
{E1,E2,...,En}                  union
E*                              Kleene closure
E^                              optionality
E1 & E2                         intersection
A -l-> (B / C), A -r-> (B / C)  monotonic rules
~S                              set complement
Head(arg1,...,argN) := Body     macro def.

Figure 2: Regular expression operators

Syllabification To define the reduplicant in
prosodic terms later on, we need syllabification
in the first place. Here a simplified finite-state version
of a proposal by (Walther, 1999a) is employed.
Its key idea is to allow incremental assignment of
syllable roles to segmental positions via a featural
decomposition of the three traditional roles, using
two binary-valued features ons and cod:

(4) Onset      ons   ~cod
    Nucleus    ~ons  ~cod
    Coda       ~ons  cod
    CodaOnset  ons   cod

As a side-effect, one gets the fourth role CO,
a monosegmental prosodic representation of
true geminates. The subcomponent sbs, for
sonority-based syllabification, itself rests on
the computation of sonority differences between
adjacent segmental positions (not shown),
where sonority may either go up or down. Together
with some self-explanatory constraints
obligatory_wordinternal_onsets and no_geminates,
prosodic surface wellformedness is then well-defined.
Only if_doubly_synced_edge_then_stressed
may seem slightly odd, since it has a purely
technical character: it rules out certain ill-formed
alternatives in wordforms. Note, however, that the
necessity of such technical constraints, which are
certainly implicit in informal analyses as well, can
only be reliably detected in computerized analyses
such as the present one, which allow for mechanical
enumeration of a grammar's denotation.

sbs := [ { [consumer(down & ~ons),
            consumer(segment & ~'Nuc')],
           [consumer(up & ~'Nuc'),
            consumer(segment & ~cod)]
         }*, no_final_onset^ ].

no_initial_coda := consumer(segment & ~cod).
no_final_onset := consumer(segment & ~ons).

syllabification := sonority_differences &
    sbs & [no_initial_coda, sbs].

% -- further constraints

obligatory_wordinternal_onsets :=
    ( segment -r-> ons / 'Nuc' ).   % _ 'N'

no_geminates := consumer(~'CO')*.

prosodic_constraints := obligatory_wordinternal_onsets &
    no_geminates &
    if_doubly_synced_edge_then_stressed.

if_doubly_synced_edge_then_stressed :=
    [ ( { consumer(~':1'),
          [consumer(':1'), consumer(~':1')],
          [consumer(':1'), consumer(':1'),
           consumer(stressed)]
        }* ), consumer(':1') ].

Stress Given the assignment of syllable roles to
segmental positions, we are now ready to define
Temiar word stress. A possibly empty sequence
of prefinal_syllables, each of which is constrained
to be of shape ON(C) and unstressed, is followed
by a final stressed syllable. The macro
ends_before_last_syll makes sure that the dividing
line between the penultimate and ultimate syllable
is drawn correctly.

stress := [prefinal_syllables &
           ends_before_last_syll,
           syllable].

prefinal_syllables :=
    ( [consumer('Ons'), consumer('Nuc'),
       (consumer('Cod')^)]* ) &
    consumer(unstressed)*.

ends_before_last_syll := ( [consumer(segment)*,
    consumer(segment & ~ons)]^ ).

syllable := [ consumer(ons)+, consumer('Nuc'),
              consumer(cod)* ] &
    (consumer(stressed)*).

Stems We proceed towards the definition of stem
by noting that - as described in §2 - both the extent
of a base's phonological material and its stress pattern
are necessary prior knowledge for adding aspectual
morphemes in the appropriate way. Hence,
we impose the respective constraints onto the isolated
base string in stem0, before wrapping the result
in the usual enrichments. However, the addition
of self loops for infixation this time is a priori
restricted to the position immediately before a
stressed onset, in accordance with the descriptive
generalization stated in §2. Experiments have shown
that using the unrestricted add_self_loops of (3.c)
would cause much unnecessary hassle in a posteriori
restriction of the possible infix locations to the
actually attested ones. It thus appears that Temiar
provides a first case for further parametrization of
at least one of the original operators from (Walther,
1999b):



base := [ consumer(':1'), consumer(':0')*,
          consumer(':1') ].

stem0(StemMaterial) :=
    add_self_loop_before(stressed & 'Ons',
        add_repeats(add_skips(StemMaterial &
            base & syllabification &
            prosodic_constraints & stress))).

stem(Segments) :=
    stem0(stringToSegments(Segments)).



Definitions for the actual stem entries of selog,
koow, yaap are shown below, using the ASCII-IPA
mapping {@ ↦ ə, E ↦ ɛ, O ↦ ɔ}. In evaluating
the first entry, the schwa actually translates
into a producer-type disjunction (ə;ɛ) with
the help of stringToSegments. It thus makes sense
to constrain this free alternation further, which
is the purpose of has_prefinal_syllable. While
the monosyllable koow needs no extra treatment,
yaap is an example of a stem ending in an
alternating_labial, whose definition however is
straightforward (medial, final refer to a positional
classification of the word that is defined later):



selog := stem("s@lOg") &
         has_prefinal_syllable.

koow := stem("kOOw").

yaap := stem0([stringToSegments("yaa"),
               alternating_labial]).

alternating_labial := {producer(p & final),
                       producer(m & medial & cod)}.



If we now define has_prefinal_syllable itself, we
have completed the components that make up
stem. While the definition really targets the prefinal
vowel, its preceding onset and the stretch of arbitrary
material after it must also be mentioned. To
tolerate interspersed technical_symbols, the ignore
operator is used (Kaplan and Kay, 1994).

The purpose of prefinal_V is to control the alternation
between 'outer' and 'inner' vowel, here
parametrized for ɛ~ə only. It does so by referencing
the next syllable role: if it is consistent with
ons, that vowel resides in an open syllable, hence
the close_mid variant (ə) will be selected. Two elsewhere
cases deal with closed syllables and the possible
presence of a technical symbol:



has_prefinal_syllable :=
    ignore( [consumer('Ons'),
             prefinal_V(('E';'@'),
                        ':0' & unstressed),
             consumer(anything)*],
            technical_symbols ).

technical_symbols :=
    (consumer((skip;repeat))*).

prefinal_V(Quality, Common) :=
    { [producer(Quality & close_mid & Common),
       consumer(ons)],
      [producer(Quality & ~close_mid & Common),
       consumer(cod)],
      [consumer((skip;repeat))]
    }.



Aspectual affixes It is time to concentrate on the
most interesting part, and that is how to define the
affixes. Again the general picture will be to see them
as constraints on word forms which are imposed by
intersection. We begin with the simulfactive. The
claim here is that its characteristic pattern is the realization
of the initial base segment (:1), followed by
the infixed melodic element /a/, and then the entire
string that begins with the stressed onset. Phrasing
the pattern this way already suffices to capture the
difference in reduplication behaviour between kɔɔw
and sə'lɔg: if we have inserted the -a- after the initial
consonant in the first base, the stressed onset is
to the left of /a/'s position, whereas in the second
base that onset is found to the right. Thus, repetition
of segments is necessary to avoid ungrammaticality
due to constraint violation in the first case
(k-a-'kɔɔw), but not in the second (s-a-'lɔg).

This behaviour is most naturally modelled by
defining a new operator seek(X), which allows for
ambiguous movement either to the left (repeat) or
to the right (skip) before imposing the restriction
X. This operator is applied to infixal /a/ because
it is precisely the infix which needs to 'seek' its
prosodically defined unique insertion point, i.e. self
loop. Finally, to ensure that the other aspectual morphemes
can play their part later on, the entire pattern
is wrapped in align to tolerate further material
before (align_right) and after it (align_left):



simulfactive :=
    align( [consumer(':1'),
            seek([producer(a & ':0' & unstressed),
                  consumer(stressed & 'Ons')])] ).

seek(X) :=
    [{producer(skip)*, producer(repeat)*}, X].

align_left(X) := [X, consumer(anything)*].
align_right(X) := [consumer(anything)*, X].
align(X) := align_right(align_left(X)).
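
How seek derives the base-length contrast can be traced with a procedural Python sketch. It is a deliberately simplified stand-in for the constraint-based grammar: the real operator is non-deterministic and wrong choices are filtered by intersection, whereas the hypothetical function simulfactive below walks directly to the insertion point. The position of the stressed onset is passed in as an index instead of being computed by syllabification.

```python
def simulfactive(base, stressed_onset, infix="a"):
    """Trace the simulfactive pattern over a linear base.

    base: list of segments; stressed_onset: index of the onset of the
    final (stressed) syllable. Returns the surface segments and the
    technical moves (skip/repeat) that were needed.
    """
    surface = [base[0]]  # realize the initial base segment (:1)
    state = 1            # automaton state after reading one segment
    moves = []
    # seek: move left via repeat arcs or right via skip arcs until the
    # state immediately before the stressed onset is reached
    while state > stressed_onset:
        state -= 1
        moves.append("repeat")
    while state < stressed_onset:
        state += 1
        moves.append("skip")
    surface.append(infix)         # infix is absorbed by the self loop here
    surface.extend(base[state:])  # stressed onset and everything after it
    return surface, moves


# ASCII-IPA rendering as in the stem entries: '@' is schwa, 'O' is open o.
koow_form, koow_moves = simulfactive(list("kOOw"), stressed_onset=0)
selog_form, selog_moves = simulfactive(list("s@lOg"), stressed_onset=2)
```

For the monosyllabic base the insertion point lies to the left, so a repeat move brings the initial consonant back in as a copy; for the longer base a single skip move passes over the prefinal vowel and no segment is repeated.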



Moving on to the continuative, we can see that the
relevant formal generalization is a bit more complex.
Again we start off with the initial base segment
(:1), but then seek a place to infix the constant /ɛ/,
before we skip_to the next synchronised base position
(:1), which inevitably will be the final one. The
pattern is completed by again seeking the stressed
onset, from which realization of the string proceeds
uninterrupted due to the licensing of extra material
that the align wrapper provides. This produces a
similar contrast with respect to (non-)reduplication
of the first base position, but makes both the repetition
of the last base segment and the truncation
of its interior material obligatory in both base types
(k-ɛ-w-'kɔɔw vs. s-ɛ-g-'lɔg):



continuative :=
    align( [consumer(':1'),
            seek([producer('E' & ':0' & unstressed)]),
            skip_to(consumer(':1')),
            seek(consumer(stressed & 'Ons'))] ).

skip_to(X) := [producer(skip)+, X].
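
The continuative pattern can be traced procedurally in the same spirit. The Python sketch below is again only an approximation under stated assumptions: the hypothetical function continuative resolves each seek deterministically, takes the index of the stressed onset as given, and treats the last base position as the next synchronised (:1) position.

```python
def continuative(base, stressed_onset, infix="E"):
    """Trace the continuative pattern over a linear base.

    Returns the surface segments and the technical moves that were needed.
    """
    moves = []

    def seek(state, target):
        """Move left (repeat) or right (skip) until `target` is reached."""
        while state > target:
            state -= 1
            moves.append("repeat")
        while state < target:
            state += 1
            moves.append("skip")
        return state

    surface = [base[0]]                  # initial base segment (:1)
    state = seek(1, stressed_onset)      # find the self loop for the infix
    surface.append(infix)
    last = len(base) - 1
    if state >= last:
        raise ValueError("skip_to needs at least one skip move")
    moves.extend(["skip"] * (last - state))  # skip_to the final :1 position
    state = last
    surface.append(base[state])          # realize the final base segment
    state = seek(state + 1, stressed_onset)  # return to the stressed onset
    surface.extend(base[state:])
    return surface, moves


koow_form, koow_moves = continuative(list("kOOw"), stressed_onset=0)
selog_form, selog_moves = continuative(list("s@lOg"), stressed_onset=2)
```

In both base types the interior material is skipped and the final consonant is copied, while the initial consonant is repeated only in the monosyllabic base.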



What is left now is the proper definition of the
causative. Here we observe from (1) that the
causative morphology always starts word-initial,
hence the use of align_left. We have a default consonant
/t/ whose realization we must somehow force
in the monosyllabic roots. Next comes a vowel,
whose quality - ə or ɛ - is again regulated by the
familiar has_prefinal_syllable. Finally, the characteristic
fixed element /r/ is specified. Upon second
thought, the /t/ is guaranteed to appear in monosyllabic
roots, because prefinal syllables always require
an onset. The default absence of the /t/ - when
not needed on prosodic grounds - is again encoded
by the producer/consumer distinction, which contrasts
the two disjuncts of the parametrized macro
default:



causative :=
    align_left( [default(t & unstressed, ':1'),
                 producer(vowel),
                 producer(r & ':1' & unstressed)] ) &
    has_prefinal_syllable.

default(Optional, Common) :=
    { producer(Common & Optional),
      consumer(Common) }.



Entire words We can put the pieces together now
by first defining the word constraint as the conjunction
of syllabification and related prosodic constraints
plus a classification of the word's segmental
positions into initial, medial, final ones. Again,
this is modulo interspersed repeat or skip symbols.
This actually means that base syllabification
and word syllabification must match up, but fortunately
this is indeed a property of our Temiar data.

Second, wordform conjoins the previous constraint
with its parameter X - which will contain the conjunction
of stem and aspect morphemes - before
eliminating leftover consumer symbols with the
help of closed_interpretation:

word := ignore( syllabification &
                prosodic_constraints &
                positional_classification,
                technical_symbols ).

positional_classification :=
    [consumer(initial), consumer(medial)*,
     consumer(final)].

wordform(X) := closed_interpretation(X & word).



These definitions have removed the last barrier
to evaluating expressions like wordform(selog &
simulfactive & causative) or even suitable disjunctive
combinations of such expressions which
define entire paradigms. Figure 3 shows an example
automaton for three forms. We refrain from describing
a final automaton operation called Bounded Local
Optimization in (Walther, 1999b) that was put
to use here to filter harmless spurious ambiguities
from the original version of fig. 3. The kind of ambiguity
involved in our Temiar grammar is one of alternative
distribution of technical symbols in strings
of the same segmental-content yield. Suffice it to say
that a simple parametrization of Bounded Local Optimization,
which could only look at length-1 transition
paths emerging from any given state, was able
to prune the unwanted alternatives by considering
technical transitions costlier in weight than segmental
transitions.

Figure 3: Temiar reduplications səralɔg, kakɔɔw, yɛmyaap

5 Conclusion 

The present paper has provided further support for 
(Walther, 1999b)'s finite-state conception of One- 
Level Prosodic Morphology by formulating - for 
the first time - a fully formalized and computa- 
tional analysis of a complicated piece of reduplica- 
tive morphology found in the Mon-Khmer language 
of Temiar. Compared to the initial proposal, all 
three core components of enriched representations, 
namely technical transitions for repeating or skip- 
ping segmental symbols and the ability to perform 
infixation by using self loops, were again found 
necessary in the course of this analysis. However, 
in Temiar the last enrichment - add_self_loops -
needed to be parametrized for a prosodic condition 
to narrow down the insertion site to a unique posi- 
tion per base. 

The prosodic condition of 'stressed onset' proved 
crucial to define that position, and accounted for the 
variation between infixing aspectual morphology in 
longer bases and descriptively prefixing morphol- 
ogy in monosyllabic ones. Temiar thus underscores 
the utility of computing with real prosodic informa- 
tion in finite-state morphology, a frequently miss- 
ing desideratum according to (Sproat, 1992, p. 170). 
Also, the symmetry of having both forward and
backward-pointing technical transitions in enriched
automata representations was exploited in a novel
regular expression operator called seek(X), which
encapsulated an interesting kind of ambiguous di- 
rectional movement (or: movement underspecified 
for direction) towards a position satisfying property 
X. This operator could rather directly be motivated 
from the data. In particular, it facilitated an insight- 
ful account of the base-length-dependent triggering 
of reduplication in the active simulfactive aspect. 

Finally, in contrast to even the most recent anal- 
yses in the theoretical linguistic literature, the full 
paradigm including the causative forms was cap- 
tured in this fairly complete analysis, together with 
phonological modifications that sometimes occur 
between base and reduplicant, as exemplified by 
yɛmyaap. Apart from an optional filtering step
for some technical spurious ambiguities that could 
make use of local optimization, neither global op- 
timization nor violable or soft constraints of the 
type argued for in Optimality Theory (Prince and 
Smolensky, 1993) were found necessary. 

For future research, the empirical base of Temiar 
should be broadened to include further reduplica- 
tion patterns, in particular those found in expres- 
sives. Also, the grammar should be amended to al- 
low for words containing geminates, which were 
initially excluded to simplify the overall analysis at 
the cost of what is at best a peripheral aspect of it. 
Because the finite-state constraints employed in this 
work are all surface-true, the potential of machine- 
learning techniques to acquire them automatically 
from surface-oriented corpora should be explored. 
Finally, it would be very interesting to broaden to 
Temiar the ongoing experiments with efficiency- 
oriented computational variants of the One-Level 
Prosodic Morphology framework that were already 
alluded to in the text. 